Paper: Tied-Mixture Language Modeling in Continuous Space

ACL ID N09-1052
Title Tied-Mixture Language Modeling in Continuous Space
Venue Human Language Technologies
Session Main Conference
Year 2009

This paper presents a new perspective to the language modeling problem by moving the word representations and modeling into the continuous space. In a previous work we in- troduced Gaussian-Mixture Language Model (GMLM) and presented some initial experi- ments. Here, we propose Tied-Mixture Lan- guage Model (TMLM), which does not have the model parameter estimation problems that GMLM has. TMLM provides a great deal of parameter tying across words, hence achieves robust parameter estimation. As such, TMLM can estimate the probability of any word that has as few as two occurrences in the train- ing data. The speech recognition experiments with the TMLM show improvement over the word trigram model.