Paper: Confidence-Weighted Learning of Factored Discriminative Language Models

ACL ID P11-2077
Title Confidence-Weighted Learning of Factored Discriminative Language Models
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2011
Authors

Language models based on word surface forms only are unable to benefit from available linguistic knowledge, and tend to suffer from poor estimates for rare features. We propose an approach to overcome these two limitations. We use factored features that can flexibly capture linguistic regularities, and we adopt confidence-weighted learning, a form of discriminative online learning that can better take advantage of a heavy tail of rare features. Finally, we extend confidence-weighted learning to deal with label noise in training data, a common case with discriminative language modeling.
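
The idea of keeping a per-feature confidence can be illustrated with a minimal sketch. It is not the algorithm of the paper: it uses an AROW-style update (Adaptive Regularization of Weights, a soft-margin relative of confidence-weighted learning) with a diagonal covariance, chosen here only because its update is short and it tolerates noisy labels. The class name DiagonalAROW, the regularizer r, the binary labels (+1 / -1) and the factored feature names such as "pos=NNS|pos=VBP" are all assumptions made for illustration.

```python
class DiagonalAROW:
    """Illustrative AROW-style learner with one variance per feature.

    Each feature has a mean weight and a variance. A high variance means
    low confidence, so rarely seen features receive larger updates than
    features that have already been updated many times.
    """

    def __init__(self, r=1.0):
        self.r = r      # hypothetical regularizer; larger r gives smaller updates
        self.mean = {}  # feature -> mean weight (implicit default 0.0)
        self.var = {}   # feature -> variance (implicit default 1.0)

    def margin(self, feats):
        # Signed score of a sparse feature vector {feature: value}.
        return sum(self.mean.get(f, 0.0) * v for f, v in feats.items())

    def update(self, feats, y):
        # y is +1 or -1. Returns True if the weights were changed.
        loss = max(0.0, 1.0 - y * self.margin(feats))
        if loss == 0.0:
            return False
        # Total uncertainty of this example under the current variances.
        conf = sum(self.var.get(f, 1.0) * v * v for f, v in feats.items())
        beta = 1.0 / (conf + self.r)
        alpha = loss * beta
        for f, v in feats.items():
            s = self.var.get(f, 1.0)
            # Step size is scaled by the feature's variance.
            self.mean[f] = self.mean.get(f, 0.0) + alpha * y * s * v
            # Variance shrinks after each update and stays positive.
            self.var[f] = s - beta * s * s * v * v
        return True


# Hypothetical factored features: a surface-form bigram and a POS bigram.
model = DiagonalAROW(r=1.0)
feats = {"w=dogs|w=bark": 1.0, "pos=NNS|pos=VBP": 1.0}
model.update(feats, +1)
```

In this sketch the POS-bigram feature would be shared by many different word bigrams, so its variance would drop quickly, while a rare surface-form feature would keep a high variance and therefore move further on the few occasions it is observed.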