Paper: Integrating history-length interpolation and classes in language modeling

ACL ID P11-1152
Title Integrating history-length interpolation and classes in language modeling
Venue Annual Meeting of the Association of Computational Linguistics
Session Main Conference
Year 2011
Authors

Building on earlier work that integrates dif- ferent factors in language modeling, we view (i) backing off to a shorter history and (ii) class-based generalization as two complemen- tary mechanisms of using a larger equivalence class for prediction when the default equiv- alence class is too small for reliable estima- tion. This view entails that the classes in a language model should be learned from rare events only and should be preferably applied to rare events. We construct such a model and show that both training on rare events and preferable application to rare events improve perplexity when compared to a simple direct interpolation of class-based with standard lan- guage models.