ACL ID | P11-2092 |
---|---|
Title | Improved Modeling of Out-Of-Vocabulary Words Using Morphological Classes |
Venue | Annual Meeting of the Association for Computational Linguistics |
Session | Main Conference |
Year | 2011 |
Authors |
We present a class-based language model that clusters rare words of similar morphology together. The model improves the prediction of words after histories containing out-of-vocabulary words. The morphological features used are obtained without the use of labeled data. The perplexity improvement compared to a state-of-the-art Kneser-Ney model is 4% overall and 81% on unknown histories.
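To make the idea of a class-based model over rare words concrete, here is a minimal sketch of a class-based bigram model using the standard factorization P(w_i | w_{i-1}) = P(c_i | c_{i-1}) · P(w_i | c_i). It is not the paper's model: the class rule (`morph_class`, based on capitalization and a two-character suffix), the `min_count` threshold, the add-alpha smoothing, and the toy corpus are all assumptions made for illustration. Frequent words act as their own class; rare and unseen words share a class with morphologically similar words, so an unseen history word can still inform the prediction.

```python
from collections import Counter

def morph_class(word, vocab):
    """Frequent words are their own class; others get a coarse class.

    The shape-plus-suffix rule is a hypothetical stand-in for learned
    morphological features.
    """
    if word in vocab:
        return word
    shape = "Cap" if word[:1].isupper() else "low"
    return f"<{shape}:{word[-2:].lower()}>"

def train(sentences, min_count=2):
    """Collect class-transition and class-to-word emission counts."""
    word_counts = Counter(w for s in sentences for w in s)
    vocab = {w for w, c in word_counts.items() if c >= min_count}
    trans, hist, emit, total = Counter(), Counter(), Counter(), Counter()
    for s in sentences:
        classes = [morph_class(w, vocab) for w in s]
        # Pair each class with the class of the preceding word (or <s>).
        for prev, cur in zip(["<s>"] + classes, classes):
            trans[(prev, cur)] += 1
            hist[prev] += 1
        for w, c in zip(s, classes):
            emit[(c, w)] += 1
            total[c] += 1
    return {"vocab": vocab, "trans": trans, "hist": hist,
            "emit": emit, "total": total}

def prob(model, word, prev, alpha=0.1):
    """P(word | prev) = P(class(word) | class(prev)) * P(word | class(word))."""
    c_prev = morph_class(prev, model["vocab"])
    c_word = morph_class(word, model["vocab"])
    n_classes = len(model["total"])
    # Add-alpha smoothing over the class transitions (an assumed choice).
    p_class = (model["trans"][(c_prev, c_word)] + alpha) / (
        model["hist"][c_prev] + alpha * n_classes)
    if model["total"][c_word] == 0:
        return 0.0  # the predicted word's class was never seen in training
    p_word = model["emit"][(c_word, word)] / model["total"][c_word]
    return p_class * p_word

# Toy corpus: each "-ly" word occurs once, so all three share one class.
sentences = [
    ["he", "quickly", "ran"], ["he", "slowly", "ran"],
    ["she", "sadly", "ran"], ["he", "ran"], ["she", "ran"],
    ["he", "sat"], ["she", "sat"],
]
model = train(sentences)

# "boldly" never occurs in training, but it maps to the same class as
# the rare "-ly" words, which were always followed by "ran".
print(prob(model, "ran", "boldly"), prob(model, "sat", "boldly"))
```

In this toy setting the unseen history "boldly" inherits the statistics of the rare "-ly" words, which is the kind of effect the abstract attributes to clustering rare words by morphology. A word-level model would have to treat that history as a single undifferentiated unknown token.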