Paper: Automatic Model Refinement - With An Application To Tagging

ACL ID C94-1023
Title Automatic Model Refinement - With An Application To Tagging
Venue International Conference on Computational Linguistics
Session Main Conference
Year 1994

Statistical NLP models usually consider only coarse information and a very restricted context so that parameter estimation remains feasible. To reduce the modeling error introduced by a simplified probabilistic model, this paper adopts the Classification and Regression Tree (CART) method to select more discriminative features for automatic model refinement. Because the features are adopted dependently when the classification tree is split in CART, the amount of training data in each terminal node is small, which makes the labeling of terminal nodes unreliable. This over-tuning phenomenon cannot be completely removed by the cross-validation process (i.e., the pruning process). A probabilistic classification model based on the selected discriminative features is thus propos...
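
The data-fragmentation argument in the abstract can be illustrated with a little arithmetic. The sketch below is not taken from the paper: it assumes a perfectly balanced binary tree, and the function name `average_leaf_size` and the sample count of 10,000 are hypothetical choices made only for illustration. It shows how quickly the average amount of training data per terminal node shrinks as each additional feature splits the data that earlier splits left behind.

```python
def average_leaf_size(num_samples: int, depth: int) -> float:
    """Average number of training samples per terminal node.

    Assumes (for illustration only) a full, balanced binary tree in which
    every split at every level is applied to the data left by earlier splits.
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    num_leaves = 2 ** depth  # each level of splits doubles the leaf count
    return num_samples / num_leaves


if __name__ == "__main__":
    # Hypothetical training-set size; not a figure from the paper.
    total = 10_000
    for depth in (2, 5, 10):
        print(f"depth={depth:2d}  leaves={2 ** depth:5d}  "
              f"avg samples per leaf={average_leaf_size(total, depth):.2f}")
```

Under these assumptions, ten successive splits leave fewer than ten samples per terminal node on average, which is the kind of sparsity that makes leaf labels sensitive to noise even after pruning.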