Paper: Language Modeling With Sentence-Level Mixtures

ACL ID H94-1014
Title Language Modeling With Sentence-Level Mixtures
Venue Human Language Technologies
Session Main Conference
Year 1994

This paper introduces a simple mixture language model that attempts to capture long-distance constraints in a sentence or paragraph. The model is an m-component mixture of trigram models. The models were constructed using a 5K vocabulary and trained using a 76 million word Wall Street Journal text corpus. Using the BU recognition system, experiments show a 7% improvement in recognition accuracy with the mixture trigram models as compared to using a trigram model.
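
As a rough illustration of what an m-component mixture of trigram models could look like when the mixing is done at the sentence level, the sketch below scores a whole sentence under each component trigram model and then combines those sentence scores with mixture weights. The abstract does not give the formula, so the combination rule, the function names (`sentence_log_prob`, `mixture_sentence_prob`), the boundary padding tokens, and the fixed probability floor for unseen trigrams are all assumptions made for illustration. They are not the paper's estimation or smoothing procedure.

```python
import math
from typing import Dict, List, Sequence, Tuple

Trigram = Tuple[str, str, str]


def sentence_log_prob(
    words: Sequence[str],
    trigram_probs: Dict[Trigram, float],
    floor: float = 1e-6,  # hypothetical stand-in for real smoothing
) -> float:
    """Log-probability of one sentence under a single trigram component."""
    # Assumed convention: two start tokens and one end token.
    padded: List[str] = ["<s>", "<s>"] + list(words) + ["</s>"]
    total = 0.0
    for i in range(2, len(padded)):
        key = (padded[i - 2], padded[i - 1], padded[i])
        total += math.log(trigram_probs.get(key, floor))
    return total


def mixture_sentence_prob(
    words: Sequence[str],
    components: Sequence[Dict[Trigram, float]],
    weights: Sequence[float],
) -> float:
    """Weighted sum over components of the whole-sentence probability."""
    if len(components) != len(weights):
        raise ValueError("need exactly one weight per component")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    log_terms = [
        math.log(w) + sentence_log_prob(words, comp)
        for w, comp in zip(weights, components)
        if w > 0.0
    ]
    # Log-sum-exp keeps the sum stable for long sentences.
    peak = max(log_terms)
    return math.exp(peak) * sum(math.exp(t - peak) for t in log_terms)


# Toy usage with two hand-made components (made-up numbers, not from the paper).
comp_a = {("<s>", "<s>", "stocks"): 0.5, ("<s>", "stocks", "</s>"): 1.0}
comp_b = {("<s>", "<s>", "stocks"): 0.1, ("<s>", "stocks", "</s>"): 1.0}
toy_prob = mixture_sentence_prob(["stocks"], [comp_a, comp_b], [0.5, 0.5])
```

Because each component scores the entire sentence before the weighted sum is taken, a component that fits the sentence's topic or style dominates the total. This is one way such a model could capture constraints that span more than the two-word trigram history.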