Paper: Reducing Weight Undertraining In Structured Discriminative Learning

ACL ID N06-1012
Title Reducing Weight Undertraining In Structured Discriminative Learning
Venue Human Language Technologies
Session Main Conference
Year 2006
Authors

Discriminative probabilistic models are very popular in NLP because of the latitude they afford in designing features. But training involves complex trade-offs among weights, which can be dangerous: a few highly-indicative features can swamp the contribution of many individually weaker features, causing their weights to be undertrained. Such a model is less robust, for the highly-indicative features may be noisy or missing in the test data. To ameliorate this weight undertraining, we introduce several new feature bagging methods, in which separate models are trained on subsets of the original features, and combined using a mixture model or a product of experts. These methods include the logarithmic opinion pools used by Smith et al. (2005). We evaluate feature bagging on linear-chain ...
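As a rough illustration of the feature bagging idea described above, the following is a minimal sketch that trains separate experts on disjoint feature subsets and combines them with a logarithmic opinion pool (a weighted geometric mean of the experts' distributions). The synthetic dataset, the number of bags, the uniform expert weights, and the use of plain logistic regression in place of the paper's linear-chain models are all illustrative assumptions, not details from the paper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=1000, n_features=40,
                           n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Partition the feature indices into disjoint subsets ("bags") so that
# no single highly-indicative feature is available to every expert.
n_bags = 4  # illustrative choice
bags = np.array_split(rng.permutation(X.shape[1]), n_bags)

# Train one expert per feature subset.
experts = [LogisticRegression(max_iter=1000).fit(X_tr[:, b], y_tr)
           for b in bags]

# Logarithmic opinion pool: p(y|x) proportional to
# prod_k p_k(y|x)^{w_k}. Uniform weights are an assumption; the paper
# also considers a mixture (arithmetic average) of the same experts.
w = np.full(n_bags, 1.0 / n_bags)
log_pool = sum(wk * np.log(e.predict_proba(X_te[:, b]) + 1e-12)
               for wk, e, b in zip(w, experts, bags))
probs = np.exp(log_pool)
probs /= probs.sum(axis=1, keepdims=True)  # renormalize the product

accuracy = (probs.argmax(axis=1) == y_te).mean()
print(f"LOP accuracy: {accuracy:.3f}")
```

Because the pool multiplies the experts' distributions, every bag must assign a label reasonable probability for the combination to favor it, which is what limits the influence of any one dominant feature.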