Paper: Estimation Of Stochastic Attribute-Value Grammars Using An Informative Sample

ACL ID C00-1085
Title Estimation Of Stochastic Attribute-Value Grammars Using An Informative Sample
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2000
Authors
  • Miles Osborne (University of Groningen, Groningen, The Netherlands)

We argue that some of the computational complexity associated with estimation of stochastic attribute-value grammars can be reduced by training upon an informative subset of the full training set. Results using the parsed Wall Street Journal corpus show that in some circumstances, it is possible to obtain better estimation results using an informative sample than when training upon all the available material. Further experimentation demonstrates that with unlexicalised models, a Gaussian prior can reduce overfitting. However, when models are lexicalised and contain overlapping features, overfitting does not seem to be a problem, and a Gaussian prior makes minimal difference to performance. Our approach is applicable for situations when there are an infeasibly large number of pa...